Networks may, or may not, be wired to have a core that is both itself densely
connected and central in terms of graph distance. In this study we propose a
coefficient to measure whether a network has such a clear-cut core-periphery
dichotomy. We measure this coefficient for a number of real-world and model
networks and find that different classes of networks have their characteristic
values. For example, geographical networks have a strong core-periphery
structure, while the core-periphery structure of social networks (despite their
positive degree-degree correlations) is rather weak. We proceed to study radial
statistics of the core, i.e. properties of the n-neighborhoods of the core
vertices for increasing n. We find that almost all networks have unexpectedly
many edges within n-neighborhoods at a certain distance from the core,
suggesting an effective radius for non-trivial network processes.



PACS numbers: 89.75.Fb, 89.75.Hc 
I. INTRODUCTION

All systems consisting of pairwise-interacting entities can be modeled as
networks. This makes the study of complex networks one of the most general and
interdisciplinary areas of statistical physics Q ll 2t l33l) . One of the most
important gains of the recent wave of statistical network studies is the
quantification of large-scale network topology fuj l33l) . Now, with the use of
just a few words and numbers, one can state the essential characteristics of a
huge network, characteristics that also say something about how dynamical
systems confined to the network will behave. A possible large-scale design
principle is that one part of the network constitutes a densely connected core
that also is central in terms of network distance, and the rest of the network
forms a periphery. In, for example, a network of airline connections you would
most certainly pass such a core airport on any many-flight itinerary. It is
known that a broad degree distribution can create a core having these
properties (9). In this paper we address the question of whether there is a
tendency for such a structure in the actual wiring of the network. That is, if
one assumes degree to be, to a large extent, an intrinsic property of the
vertices, is the network then organized with a distinct core-periphery
structure or not? To give a quantitative answer to this question, our first
step is to find a core with the above-mentioned properties: being highly
interconnected and having a high closeness centrality (41) (the inverse average
distance between a vertex in the core and an arbitrary vertex). Once such a
subgraph is identified we calculate its closeness centrality relative to the
graph as a whole, and subtract the corresponding quantity for the ensemble of
random graphs with the same set of degrees as the original network (cf. Ref.
(29)). If the resulting coefficient is positive the network shows a pronounced
core-periphery structure. Once the core and periphery are distinguished one may
proceed to investigate their structure. We look at the statistical properties
of the n-neighborhoods (the set of vertices at distance n) of the core
vertices. By such radial statistics we can draw conclusions about the
respective functions of the core and periphery. This paper starts by defining
the core-periphery coefficient and measuring it for real-world networks of
numerous types, then proceeds by discussing and measuring radial statistics.



II. MEASURING THE CORE-PERIPHERY STRUCTURE 

In this paper we assume the network to be represented as 
a graph G = (V, E) with a set V of N vertices and a set E 
of M undirected and unweighted edges. (It is straightforward 
to generalize our analysis to weighted networks.) Since our 
analysis requires the network to be connected we will hence- 
forth identify G with the largest connected component of the 
network (in all mentioned cases this component will consti- 
tute almost the entire network). We also remove self-edges 
and multiple edges. 



A. Rationale and definition of the core-periphery coefficient 

The notion of network centrality is very broad, and many measures have been
proposed to capture different aspects of the concept (Op). One of the simplest
quantities is the closeness centrality (41)



\[ C_C(i) = \left( \langle d(i,j) \rangle_{j \in V \setminus \{i\}} \right)^{-1} \tag{1} \]



of a vertex i, where d(i, j) is the distance between i and j (the 
smallest number of edges on a path from i to j). The closeness 
of a vertex is thus the reciprocal average shortest distance to 
the other vertices of V. This definition is straightforwardly 
extended to a subset U of vertices 



\[ C_C(U) = \left( \left\langle \langle d(i,j) \rangle_{j \in V \setminus \{i\}} \right\rangle_{i \in U} \right)^{-1} . \tag{2} \]
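As a small illustration of Eqs. (1) and (2), the following Python sketch computes the closeness of a vertex and of a vertex subset by breadth-first search. The adjacency-dictionary representation, the function names and the path-graph example are assumptions made for this illustration only; the graph is taken to be connected, as assumed throughout the paper.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distances from source to every reachable vertex (breadth-first search)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def closeness(adj, i):
    """Eq. (1): reciprocal of the mean distance from i to all other vertices."""
    dist = bfs_distances(adj, i)
    others = [d for v, d in dist.items() if v != i]
    return len(others) / sum(others)

def subset_closeness(adj, U):
    """Eq. (2): reciprocal of the average, over i in U, of the mean distance from i."""
    mean_dists = [1.0 / closeness(adj, i) for i in U]
    return len(mean_dists) / sum(mean_dists)

# Hypothetical example: the path graph 0 - 1 - 2 - 3 - 4.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(closeness(path, 2), closeness(path, 0), subset_closeness(path, [1, 2, 3]))
```

In this example the middle vertex has a larger closeness than the end vertex, and the subset {1, 2, 3} has a closeness between that of its members.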



So we require a core to be a subgraph U with high C_C(U), but also to be a
well-defined cluster, i.e. to have comparatively many edges within. Now, if
there are many facets of the centrality concept, there are even more algorithms
to identify graph clusters, each being a de facto cluster definition (33). For
simplicity we choose the most rudimentary cluster definition: the set of
k-cores. A k-core is a maximal subgraph with the minimum degree k (maximal in the
TABLE I The network sizes N and M, the core-periphery coefficient c_cp and the
relative assortative mixing coefficient Δr for a number of networks. In the
interstate network the vertices are American interstate highway junctions and
two junctions are connected if there is a road with no junction in between. The
pipeline network is a similar network of junctions and gas pipes. In the street
networks the vertices are Swedish city street segments, connected if they share
a junction. In the airport data (obtained from IATA, www.iata.org) the vertices
are airports and the edges represent airport pairs with a non-stop flight
connection. The Internet figures are averages of 15 AS-level graphs constructed
from traceroute searches. The arXiv, board of directors and Ajou students
networks are one-mode projections constructed from affiliation networks (where
links go from persons to e-prints, corporate boards and university classes
respectively). The student network is averaged over graphs for 16 semesters.
One edge represents two students taking at least three classes together that
semester. The high school, prisoner and social scientist networks are gathered
from questionnaires; an edge means that two persons have listed each other as
acquaintances. The high school data are averaged over 84 individual schools. In
the electronic communication networks one edge represents that at least one of
the vertices has contacted the other over some electronic medium. In the nd.edu
data the vertices are HTML documents and the edges are hyperlinks. The citation
graph is constructed from preprints in the field of high-energy physics Q)
(see: http://www.cs.cornell.edu/projects/kddcup/datasets.html). In the software
dependency graph the vertices are software packages and an edge means that one
package needs the other for its proper function. The food webs are networks of
water-living species and an edge means that one species preys on the other. For
the protein networks an edge means that two proteins bind to each other
physically. The metabolic and "whole cellular" networks consist of chemical
substances and edges indicating that one molecule occurs in the same reaction
as the other (the values for these networks are averages over 43 organisms from
different domains of life).





Network                              Ref.      N           M             c_cp        Δr

Geographical networks
  Interstate highways                          935         1315          0.231(1)    0.0851(5)
  Pipelines                          (19)      2999        3079          0.180(2)    0.073(2)
  Streets, Stockholm                 (40)      3325        5100          0.255(1)    0.080(1)
  Streets, Göteborg                  (40)      1258        1516          0.040(3)    0.019(3)
  Airport                                      449         2795          0.0523(3)   0.0910(3)
  Internet                           Q£)       1968(66)    4051(121)     0.045(2)    0.009(3)

One-mode projections of affiliation networks
  arXiv                              (30)      48561       287570        -0.08(3)    0.361(3)
  Board of directors                 Ui>       6193        43074         -0.037(2)   0.280(2)
  Ajou University students           (24; 35)  7285(128)   75898(6566)   -0.08(1)    0.66(4)

Acquaintance networks
  High School friendship             (5)       571(43)     1078(85)      0.006(7)    0.19(1)
  Prisoners                          (27)      58          83            -0.043(2)   0.264(2)
  Social scientists                  (18)      34          265(35)       -0.002(4)   0.10(1)

Electronic communication
  e-mail, Ebel et al.                (13)      39592       57703         -0.229(4)   -0.001(4)
  e-mail, Eckmann et al.             (14)      3186        31856         -0.091(2)   -0.034(2)
  Internet community, nioki.com      (42)      49801       239265        -0.014(2)   0.007(2)
  Internet community, pussokram.com  (23)      28295       115335        -0.183(5)   -0.005(5)

Reference networks
  WWW, nd.edu                                  325729      1090108       -0.027(3)   -0.003(3)
  HEP citations                                27400       352021        -0.10(1)    0.03(1)

Software dependencies
  GNU / Linux                        (32)      504         793           -0.155(1)   -0.087(1)

Food webs
  Little Rock Lake                   (28)      92          960           0.005(6)    -0.0141(6)
  Ythan Estuary                      (22)      134         593           -0.020(1)   -0.0153(9)

Neural network
  C. elegans                         (44)      280         1973          0.040(6)    0.0222(7)

Biochemical networks
  Drosophila protein                 (20)      2915        4121          -0.035(2)   0.003(1)
  S. cerevisiae protein              (34)      3898        7283          -0.249(1)   -0.069(1)
  S. cerevisiae genetic              (34)      1503        5043          -0.0646(7)  -0.101(1)
  Metabolic networks                 (25)      427(27)     1257(88)      -0.002(6)   0.006(1)
  Whole cellular networks            (25)      623(32)     1752(103)     -0.004(6)   -0.001(2)



sense that if one adds any vertex to a k-core, it will no longer have a minimal
degree k). To calculate a sequence of k-cores is computationally cheaper
(linear in M (17)) than more elaborate clustering algorithms.[1] So we let our
core V_core(G) be the k-core with maximal closeness and define the
core-periphery coefficient c_cp as

\[ c_{\mathrm{cp}}(G) = \frac{C_C[V_{\mathrm{core}}(G)]}{C_C[V(G)]} - \left\langle \frac{C_C[V_{\mathrm{core}}(G')]}{C_C[V(G')]} \right\rangle_{G' \in \mathcal{G}(G)} \tag{3} \]

where 𝒢(G) is the ensemble of graphs with the same set of degrees as G. The
sequence of k-cores is not necessarily unique. We maximize C_C(U) over m_seq
different sequences. In practice different runs almost always yield the same
core, and the value of m_seq seems to matter little. The m_null elements of
𝒢(G) can be obtained by randomization of G in time and space of the order of M
(39). In this paper we use m_null = 1000 and m_seq = 10 for networks with
N < 5000, and m_null = 50 and m_seq = 3 for N > 5000.

[1] One iteratively removes the vertex of currently lowest degree k_min; if
k_min is not lower than its largest value during the iterations, then the
remaining network is a k-core.
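The procedure behind Eq. (3) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used for the results in this paper: the k-core peeling uses a simple O(N) minimum search rather than the linear-time algorithm, only one peeling sequence is examined (m_seq = 1), the null model is sampled by repeated double-edge swaps with an arbitrarily chosen number of swaps, and disconnected samples are skipped. All function names and the example graph (a 4-clique with one pendant vertex attached to each clique vertex) are hypothetical.

```python
import random
from collections import deque

def distances(adj, source):
    """Breadth-first hop distances from source to all reachable vertices."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def closeness(adj, U):
    """Eq. (2): reciprocal of the average, over i in U, of the mean distance from i."""
    n = len(adj)
    return len(U) / sum(sum(distances(adj, i).values()) / (n - 1) for i in U)

def k_cores(adj):
    """Peel lowest-degree vertices; keep the remaining set whenever k_min sets a new record."""
    deg = {v: len(adj[v]) for v in adj}
    alive, cores, k_max = set(adj), [], 0
    while alive:
        v = min(alive, key=deg.get)  # O(N) scan for brevity
        if deg[v] > k_max:
            k_max = deg[v]
            cores.append(frozenset(alive))
        alive.remove(v)
        for w in adj[v]:
            if w in alive:
                deg[w] -= 1
    return cores

def rewire(adj, swaps_per_edge=10, rng=random):
    """Degree-preserving randomization by double-edge swaps (assumed sampler)."""
    edges = [(u, v) for u in adj for v in adj[u] if u < v]
    present = set(edges)
    for _ in range(swaps_per_edge * len(edges)):
        a, b = rng.sample(range(len(edges)), 2)
        (u, v), (x, y) = edges[a], edges[b]
        if rng.random() < 0.5:
            x, y = y, x
        e1, e2 = (min(u, y), max(u, y)), (min(x, v), max(x, v))
        if u == y or x == v or e1 in present or e2 in present:
            continue  # would create a self-edge or a multiple edge
        present -= {edges[a], edges[b]}
        present |= {e1, e2}
        edges[a], edges[b] = e1, e2
    new_adj = {v: [] for v in adj}
    for u, v in edges:
        new_adj[u].append(v)
        new_adj[v].append(u)
    return new_adj

def core_ratio(adj):
    """C_C of the k-core with maximal closeness, divided by C_C of the whole vertex set."""
    return max(closeness(adj, c) for c in k_cores(adj)) / closeness(adj, list(adj))

def core_periphery_coefficient(adj, m_null=20, rng=random, max_tries=10000):
    """Eq. (3): observed ratio minus its mean over connected rewired graphs."""
    null = []
    for _ in range(max_tries):
        g = rewire(adj, rng=rng)
        if len(distances(g, next(iter(g)))) == len(g):  # skip disconnected samples
            null.append(core_ratio(g))
        if len(null) == m_null:
            break
    return core_ratio(adj) - sum(null) / len(null)

# Hypothetical example: a 4-clique with one pendant vertex per clique vertex.
g = {v: [] for v in range(8)}
for u, v in [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
             (0, 4), (1, 5), (2, 6), (3, 7)]:
    g[u].append(v)
    g[v].append(u)
print(k_cores(g)[-1], core_ratio(g))
```

For this example graph the degree sequence essentially fixes the wiring, so the coefficient is zero up to rounding: the core is entirely explained by the degrees. This is the situation the subtraction of the null-model average in Eq. (3) is meant to detect.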
The correlation of degrees at either side of an edge is an informative
structure to study (29; 31; 37). To some extent one can see degree-degree
correlations as a local version of the core-periphery structure: when there are
positive degree-degree correlations, at least some subgraphs of the network
will have a well-defined core and periphery. Such clusters need not be
centrally positioned in the graph as a whole, so while the degree-degree
correlations say something about whether the graph can be separated into
densely and sparsely connected regions, the core-periphery structure gives
information about the relative position of such regions. A common way to
quantify the average degree-degree correlations is to measure the assortative
mixing coefficient (31)

\[ r = \frac{4\langle k_1 k_2 \rangle - \langle k_1 + k_2 \rangle^2}{2\langle k_1^2 + k_2^2 \rangle - \langle k_1 + k_2 \rangle^2} \tag{4} \]

where k_i is the degree of the i'th argument of an edge as it appears in a list
of E, and the angle brackets denote an average over the edges. Now, our null
model is a random graph conditioned to have the same degree sequence as the
original graph. In other words, just as for the core-periphery structure, we
consider the deviation from our null model and measure the relative assortative
mixing coefficient

\[ \Delta r(G) = r(G) - \langle r(G') \rangle_{G' \in \mathcal{G}(G)} . \tag{5} \]
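To make Eqs. (4) and (5) concrete, here is a short Python sketch that evaluates the assortative mixing coefficient from an edge list and forms the relative coefficient from a set of null-model values. The function names and the two example graphs are assumptions for illustration; how the null-model graphs are sampled is left to the caller.

```python
def assortativity(edges, degree):
    """Eq. (4): assortative mixing coefficient r from an undirected edge list."""
    n = len(edges)
    mean_prod = sum(degree[u] * degree[v] for u, v in edges) / n
    mean_sum = sum(degree[u] + degree[v] for u, v in edges) / n
    mean_sq = sum(degree[u] ** 2 + degree[v] ** 2 for u, v in edges) / n
    denominator = 2 * mean_sq - mean_sum ** 2
    if denominator == 0:
        raise ValueError("r is undefined when all edge-end degrees are equal")
    return (4 * mean_prod - mean_sum ** 2) / denominator

def relative_assortativity(r_observed, r_null_samples):
    """Eq. (5): observed r minus the average r over null-model graphs."""
    return r_observed - sum(r_null_samples) / len(r_null_samples)

def degrees_from_edges(edges):
    """Count how many edges touch each vertex."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return degree

# Hypothetical examples: a star with four leaves and a path on four vertices.
star = [(0, 1), (0, 2), (0, 3), (0, 4)]
path = [(0, 1), (1, 2), (2, 3)]
r_star = assortativity(star, degrees_from_edges(star))
r_path = assortativity(path, degrees_from_edges(path))
print(r_star, r_path)
```

The star is maximally disassortative (r = -1), since every edge joins the single high-degree vertex to a degree-one vertex.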

B. Numerical results for real-world networks 

In Table I, c_cp and Δr are displayed for a number of real-world networks. We
find that the core-periphery structure and relative degree-degree correlations
follow the different classes of networks rather closely. Furthermore, the
core-periphery structure and degree-degree correlations seem to be quite
independent network structures in practice. For example, geographically
embedded networks have a clear core-periphery structure and weakly positive
degree-degree correlations; social networks derived from affiliations have
slightly negative c_cp-values but very high Δr-values; networks of online
communication have markedly negative c_cp and rather neutral degree-degree
correlations. Most geographically embedded networks have the function of
transporting, or transmitting, something between the vertices. Networks with a
well-defined core (which most paths pass through) and a periphery (covering
most of the area) are known to have good performance with respect to
communication times (19). Also networks of airline traffic (21) and the
hardwired Internet (36; 45) are known to have well-defined cores due to
traffic-flow optimization. The class of one-mode projection networks (social
networks constructed by linking people that participate in something together:
movies, scientific research, etc.) shows slightly negative c_cp-values. This
can, at least for the data sets of scientific coauthors (30) and fellow
students of a Korean university (24; 35), be explained by a grouping of the
people on the basis of specialization (and, in student networks, also by grade)
and thus no well-defined core. We note that this group of networks has very
high Δr values. The interview-based acquaintance networks show rather neutral
c_cp-values and positive Δr, suggesting that there is a degree of independence
between the two quantities. This is quite similar to the one-mode




FIG. 1 Core-periphery structure of model networks. The Barabási-Albert and
Watts-Strogatz networks have M = 2N. The core-periphery model has the parameter
f_core = 0.96 (i.e. the intended core consists of 4% of the vertices) and
γ = 3. All values are averaged over 10^4 to 10^5 network realizations. The
BA-model line is a fit to a power-law form a_0 + a_1 N^{-a_2} (this fit gives
c_cp(∞) = a_0 = 0.004(9)).



projections, which probably is not a coincidence: there is a strong correlation
between acquaintance ties and the organizations people are affiliated with. The
vertices in electronic communication networks are also people but the network
structures of these are quite different; the degree-degree correlation is
typically slightly negative, as is the core-periphery coefficient. Information
networks where the edges refer to supporting information sources (our examples
are a subgraph of the WWW and a graph of citations between papers in
high-energy physics) can be expected to be grouped into topics, hence the
negative c_cp. The same explanation applies to the negative c_cp of the
software dependency graph. Food webs are other stratified networks where the
lack of a well-defined core seems natural. The core of the neural network of
C. elegans is a clique (fully connected subgraph) of eight neurons, which
accounts for the positive c_cp and Δr values. The biochemical networks all show
negative c_cp values and negative, or neutral, relative assortative mixing
coefficients.



C. Numerical results for network models 

In addition to the real-world networks of Table I we also measure the
core-periphery coefficient for a few network models. For simple random graphs
(15), where N vertices are randomly connected by M edges, defining an ensemble
G(N, M) of graphs, 𝒢(G) is precisely the elements of G(N, M) with the same
degree sequence as G. This means that, on average, c_cp will be zero for random
graphs. A popular network model is the Barabási-Albert (BA) model (41), where
the graphs are grown by iteratively adding new vertices with edges to old
vertices with a probability proportional to the degree of the old vertices. In
Fig. 1 we see that c_cp tends to zero (or a value very close to zero) for BA
model networks. The BA model has an assortative mixing coefficient r that tends
to zero as N grows (31). From this one sees that the high-degree vertices are
not more interconnected than can be expected from their degrees, and thus that
there is no preference in the actual wiring of the network for a well-defined
core in the sense of the c_cp-coefficient. We also investigate the
Watts-Strogatz (WS) small-world network model (43), where one end of the edges
of a circulant (8) is rewired with a certain probability (0.01 in our case).
Just as for the BA model, c_cp converges to zero (see Fig. 1). This is not so
surprising: in the WS model's starting point, the circulant, every vertex is in
the same position. The rewiring procedure does not aggregate vertices to a
well-defined core either. Finally we construct a network
model with a positive core-periphery coefficient. We start by drawing N
power-law distributed random integers in the interval [2, ∞), i.e. the
probability for a number m to be drawn is proportional to m^{-γ}, and sort
these numbers in increasing order: m_1, ..., m_N. These numbers are the desired
degrees of the vertices and can be thought of as stubs, or half-edges, that
need to be connected. Now we will attempt to make a well-defined core of the
vertices i_core, ..., N, where i_core is the integer closest to N f_core (so
f_core is a parameter setting the relative size of the core). Then we go
through the vertices i_core, ..., N in increasing order and for each vertex i
try to attach the stubs to the vertices j = i + 1, ..., N (once again in
increasing order) as long as the degree of j is less than m_j. The remaining
stubs are paired together randomly and made into edges if they do not form
loops or multiple edges. The superfluous stubs are then deleted. For this model
c_cp indeed shows positive and growing values, see Fig. 1.
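One possible reading of this construction is sketched below in Python. Several details are assumptions of the sketch and not statements about the original implementation: the power law is truncated at N - 1 so that it can be sampled exactly, vertices are indexed from zero (so the core is taken to be the vertices with index at least round(N f_core)), a core vertex stops attaching once its own stubs are used up, and the function name and default arguments are hypothetical. The resulting graph need not be connected; in the analysis above one would then work with its largest connected component.

```python
import random

def core_periphery_model(n, f_core=0.96, gamma=3.0, rng=random):
    """Return (desired degrees, edge set) for one realization of the sketched model."""
    # Desired degrees: integers m >= 2 with P(m) proportional to m**(-gamma),
    # truncated at n - 1 (assumed cutoff), sorted in increasing order.
    values = list(range(2, n))
    weights = [m ** -gamma for m in values]
    desired = sorted(rng.choices(values, weights=weights, k=n))

    edges = set()
    degree = [0] * n

    # Step 1: wire the intended core deterministically, in increasing vertex order.
    i_core = round(n * f_core)
    for i in range(i_core, n):
        for j in range(i + 1, n):
            if degree[i] >= desired[i]:
                break  # vertex i has no stubs left
            if degree[j] < desired[j]:
                edges.add((i, j))
                degree[i] += 1
                degree[j] += 1

    # Step 2: pair the remaining stubs at random; stubs that would form a
    # self-edge or a multiple edge are discarded.
    stubs = [v for v in range(n) for _ in range(desired[v] - degree[v])]
    rng.shuffle(stubs)
    for u, v in zip(stubs[0::2], stubs[1::2]):
        edge = (min(u, v), max(u, v))
        if u != v and edge not in edges:
            edges.add(edge)
            degree[u] += 1
            degree[v] += 1
    return desired, edges

desired, edges = core_periphery_model(200, rng=random.Random(7))
print(len(edges), desired[-5:])
```

Because the highest desired degrees sit at the end of the sorted list, the deterministic first step links the high-degree vertices to each other before any random pairing takes place, which is what is intended to produce a dense and central core.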



III. RADIAL ORGANIZATION OF NETWORKS 

A well-defined core is a useful starting point for a radial examination of the
network. By plotting quantities averaged over the n-neighborhoods (the set of
vertices at a distance n from a vertex) of the core vertices as functions of n,
one can get an idea of the respective purposes of the core and periphery. This
kind of statistics is naturally more meaningful the stronger the core-periphery
structure is. The c_cp construction identifies the most central well-connected
core but it does not say whether or not the core makes sense; even for slightly
negative c_cp-values this type of radial statistics may be informative. While
other authors have focused on the size of the n-neighborhoods of random
vertices (26; 38), a useful approach to monitor finite-size effects that affect
spreading processes such as disease epidemics, we will focus on quantities that
we find more informative regarding the relative functions of the core and
periphery.

To get a rough view of the radial network organization we plot the average
degree ⟨k⟩ of the vertices in the n-neighborhood of core vertices as a function
of n in Fig. 2. We include the corresponding results for our null model (random
networks constrained to the same degree sequence as the original). The core
vertices themselves almost always get higher average degree for the null model
than the real-world networks (5-10% higher for the networks of Fig. 2). For the
first neighborhood the situation is reversed: the real-world networks have
higher ⟨k⟩ than the null model. Then the degrees decrease monotonically,
typically faster for the null-model networks. For the street network in
Fig. 2(a), ⟨k⟩ decreases rather slowly for intermediate n; the periphery is
thus rather homogeneous. The short average distances of the core, consisting of
the streets of the city center, can be attributed to its central geographic
position.

One can imagine different functions of the peripheral vertices: either they are
just conveying information, traffic, etc. to and from the core; or they are,
just as the core vertices, involved in the general network processes, only less
intensely. To understand this we measure the average value of the quantity

\[ \mu(i, n) = \frac{M(K_n(i))}{EM(K_n(i))} \tag{6} \]

over the core vertices; M(K_n(i)) is the number of edges within i's
n-neighborhood K_n(i) and EM(K_n(i)) is the expected number of edges in a set
of vertices of the same degrees as K_n(i) in a random graph of the same degree
sequence as the original graph G. Calculating EM is known to be a hard counting
problem (0), so we have to rely on the same random sampling as for the
c_cp-calculation. To save time one can calculate EM(K) as the average number of
edges within the original subgraph K at the same time as the 𝒢(G)-sampling of
the c_cp calculation. In Fig. 2(d)-(f) we diagram ⟨μ⟩(n) for our three example
networks. Since the core is constructed to be highly interconnected it is no
surprise that ⟨μ⟩ has a peak for small n. For the metabolic network of
Fig. 2(f) this peak is small. This is due to the exceptionally high degrees
~ 55 of the core vertices (including substrates such as H2O, ATP and ADP): even
in the null-model networks this set of vertices will, for combinatorial
reasons, be highly interconnected. For intermediate n the ⟨μ⟩-values are of the
order of unity, i.e. there is no overrepresentation of edges between vertices
at this distance from the core. But as n increases, ⟨μ⟩ grows to a sharp peak
before it eventually drops to zero. This seems like a rather ubiquitous feature
(at least it is present in almost all networks of Table I). We interpret this
as the periphery having both of the functions listed above: up to a certain
distance from the core (defined by the peak) vertices have similar function and
are for this reason connected (and since such a small set of, probably,
low-degree vertices is unlikely to have many interior edges, μ becomes high);
beyond this distance the network consists only of cycle-free branches. This
dichotomy, the network inside and outside of the peak radius, is yet more
distinct than the core vs. periphery as defined above. On the other hand the
outside is functionally rather trivial and (in all cases we study) smaller than
the inside (we believe the term "core" is more apt for smaller subgraphs). We
note that this peak is not trivially related to the peak in the size of the
n-neighborhood, which is much broader and occurs for smaller n.
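The radial quantities of this section can be sketched in Python as follows. The sketch assumes an adjacency-dictionary graph and a list of null-model graphs on the same vertex labels (for instance obtained by degree-preserving rewiring); EM is estimated, as described above, by counting edges within the original vertex set K_n(i) in each null-model graph. The function names and the small example graph are hypothetical.

```python
from collections import deque

def shells(adj, i):
    """Map each distance n to the n-neighborhood K_n(i) of vertex i."""
    dist = {i: 0}
    queue = deque([i])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    by_distance = {}
    for v, d in dist.items():
        by_distance.setdefault(d, set()).add(v)
    return by_distance

def mean_degree(adj, vertices):
    """Average degree <k> of a set of vertices."""
    return sum(len(adj[v]) for v in vertices) / len(vertices)

def edges_within(adj, vertices):
    """M(K): number of edges with both end points in the vertex set K."""
    return sum(1 for u in vertices for v in adj[u] if v in vertices) // 2

def mu(adj, null_graphs, i, n):
    """Eq. (6): M(K_n(i)) divided by its average over the null-model graphs."""
    K = shells(adj, i).get(n, set())
    expected = sum(edges_within(g, K) for g in null_graphs) / len(null_graphs)
    return edges_within(adj, K) / expected if expected > 0 else float("nan")

# Hypothetical example: a 4-clique with one pendant vertex per clique vertex.
g = {v: [] for v in range(8)}
for u, v in [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
             (0, 4), (1, 5), (2, 6), (3, 7)]:
    g[u].append(v)
    g[v].append(u)
first_shell = shells(g, 0)[1]
print(sorted(first_shell), mean_degree(g, first_shell), edges_within(g, first_shell))
```

Averaging `mean_degree` and `mu` over all core vertices for each n gives curves of the kind discussed in this section; when the null-model estimate of EM is zero the ratio is undefined, which the sketch reports as NaN.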



IV. SUMMARY AND DISCUSSION 

In many networks the properties of vertices are heterogeneously distributed;
similarly, one can find subgraphs with very different characteristics and
functions. Perhaps the simplest division of a network is that into a core and a
periphery. The core concept has been used in various senses in the past;
[Figure 2. Panel titles: Stockholm, streets; WWW, nd.edu; metabolic network, C. elegans. Horizontal axis: n.]

FIG. 2 Radial statistics for three real-world networks. (a)-(c) show the
average degree ⟨k⟩ of the n-neighborhoods of the core vertices as a function of
n for three real-world networks: a network of streets in Stockholm, Sweden
(40), a network of hyperlinked web pages (3) and the metabolic network of
C. elegans (25). Curves for our null model (random networks with the same
degree sequence as the original network) are included. In (d)-(f) we plot the
number of edges within the n-neighborhood relative to the expected number of
edges given the degree sequence of the n-neighborhood and the graph as a whole.
Lines are guides for the eyes.



typically it is defined as a subgraph which is most tightly connected Q Q3) or
a most central one. In this paper we use the rather strong precepts that a core
should be both highly interconnected and central. To quantify this idea, we
define the core as the k-core of highest closeness centrality. Then, to measure
the strength of the tendency to have a central and highly connected core, we
define a core-periphery coefficient as the normalized closeness centrality of
the core minus the corresponding average value for our null model (random
networks of the same degree sequence as the original network). Different types
of networks have their characteristic c_cp-values: geographically embedded
networks typically have a positive core-periphery coefficient. We explain this
as an effect of their communication-time optimization. Social networks, on the
other hand, typically have slightly negative c_cp values despite their positive
degree-degree correlations. We show that c_cp for model networks such as the
Erdős-Rényi, Barabási-Albert and Watts-Strogatz models goes to zero (or at
least to a very small value) as the network size increases, but that one can
construct networks with a positive c_cp in the large-system limit. Once the
core of a network is found one can construct a radial image of the network by
plotting quantities averaged over the n-neighborhoods of the core vertices as a
function of n. One such quantity we study is μ(i, n), the ratio of the number
of edges within the n-neighborhood of i to the expected number of edges in a
subgraph of the same set of degrees in the null model. ⟨μ⟩ shows, almost
ubiquitously, a peak at intermediate n. We interpret this peak as an effective
radius of the network. Much remains to be done in terms of characterizing the
cores and peripheries of complex networks. We believe this dichotomy and the
radial imagery we present are very useful tools to understand the large-scale
architecture of such networks.